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Abstract 

S&P 500 index data sampled at one-minute intervals over the course of 11.5 years (Jan- 
uary 1989- May 2000) is analyzed, and in particular the Hurst parameter over segments of 
stationarity (the time period over which the Hurst parameter is almost constant) is esti- 
mated. An asymptotically unbiased and efficient estimator using the log-scale spectrum is 
employed. The estimator is asymptotically Gaussian and the variance of the estimate that 
is obtained from a data segment of N points is of order -A*. Wavelet analysis is tailor made 
for the high frequency data set, since it has low computational complexity due to the pyra- 
midal algorithm for computing the detail coefficients. This estimator is robust to additive 
non-stationarities, and here it is shown to exhibit some degree of robustness to multiplica- 
tive non-stationarities, such as seasonalities and volatility persistence, as well. This analysis 
shows that the market became more efficient in the period 1997-2000. 
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1 Introduction 



Stochastic models based primarily on continuous or discrete time random walks have been the 
foundation of financial engineering since they were introduced in the economics literature in the 
1960s. Such models exploded in popularity because of the successful option pricing theory built 
around them by Black and Scholes [13] and Cox et al. [15], as well as the simplicity of the 
solution of associated optimal investment problems given by Merton |33j. 

Typically, models used in finance are diffusions built on standard Brownian motion and they 
are associated with partial differential equations describing corresponding optimal investment or 
pricing strategies. At the same time, the failure of models based on independent increments to 
describe certain financial data has been observed since Greene and Fielitz [21] and Mandelbrot 
[51] . and [30]. Using R/S analysis, Greene and Fielitz studied 200 daily stock returns of securities 
listed on the New York Stock Exchange and they found significant long range dependence. 
Contrary to their finding, Lo [27J, using a modified R/S analysis designed to compensate for the 
presence of short-range dependence, finds no evidence of long-range dependence (LRD). However, 
Teverovsky et al. [36] and Willinger et al. [37] identified a number of problems associated with 
Lo's method. In particular, they showed that Lo's method has a strong preference for accepting 
the null hypothesis of no long range dependence. This happens even with long-range dependent 
synthetic data. To account for the long-range dependence observed in financial data Cutland 
et al. [16J proposed to replace Brownian motion with fractional Brownian motion (fBm) as the 
building block of stochastic models for asset prices. An account of the historical development of 
these ideas can be traced from Cutland et al [15] . Mandelbrot [32] and Shiryaev [33]. The S&P 
500 index was analyzed in [37] and [38] by Peters using R/S analysis, and he concluded that the 
raw return series exhibits long-range dependence. See also [23] for analysis of LRD in German 
stock indices. 

Here we present a study of a high-frequency financial data set exhibiting long-range depen- 
dence, and develop wavelet based techniques for its analysis. In particular we examine the S&P 
500 over 11.5 years, taken at one-minute intervals. The wavelet tool we consider, namely the 
log-scale spectrum method, is asymptotically unbiased and efficient with a vanishing precision 
error for estimating the Hurst parameter (a measure of long-range dependence, explained in ([2]) 
below). (See Theorem 12.11 ) Since we are dealing with high frequency data, we need fast algo- 
rithms for the processing of the data. Wavelet analysis is tailor-made for this purpose due to 
the pyramidal algorithm, which calculates the wavelet coefficients using octave filter banks. In 
essence, we look at a linear transform of the logarithm of the wavelet variance (i.e. the variance 
of the detail coefficients, defined in Q) to estimate the Hurst parameter. Moreover, the log- 
scale spectrum methodology is insensitive to additive non-stationarities, and, as we shall see, 
it also exhibits robustness to multiplicative non-stationarities of a very general type including 
seasonalities and volatility persistence (Section 12. 4ft . 

Although the Hurst parameter of S&P 500 data considered here is significantly above the 
efficient market value of H = |, it began to approach that level around 1997. This behavior 
of the market might be related to the increase in Internet trading, which has the three-fold 
effect of increasing the number of small traders, increasing the frequency of trading activity, 
and improving traders' access to price information. An analytical model of this observation is 
proposed in [TU] . 
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1.1 Fractional Brownian Motion 



A natural extension of the conventional stochastic models for security prices to incorporate 
long-range dependence is to model the price series with geometric fractional Brownian motion: 



P t = Po exp [fH+ °sdB? , (1) 



o 

where Po is today's observed price, \x is a growth rate parameter, a is the stochastic volatility 
process, and B H is a fractional Brownian motion, an almost surely (a.s.) continuous and centered 
Gaussian process with stationary increments, autocorrelation of B H 

E{B»B?} = 1 -(\tr+\sr-\t-sn, ( 2 ) 

where H G (0, 1] is the so-called Hurst parameter. (Note that H = ^ gives standard Brownian 
motion.) From this definition, it is easy to see that fBm is self-similar, i.e. B H (at) = a H B(t), 
where the equality is in the sense of finite dimensional distributions. This model for stock 
market prices is a generalization of the model proposed in [16] to allow for non-Gaussian returns 
distribution into the model. Heavy tailed marginals for stock price returns have been observed 
in many empirical studies since the early 1960's by Fama [20J and Mandelbrot |29j . 

Fractional Brownian motion models are able to capture long range dependence in a parsimo- 
nious way. Consider for example the fractional Gaussian noise Z(k) := B H (k) — B H (k — 1). The 
auto-correlation function of Z, which is denoted by r, satisfies the asymptotic relation 

r(k) ~ r(0)H(2H- l)k 2H - 2 , as k -> oo. (3) 

For H G (1/2,1], Z exhibits long-range dependence, which is also called the Joseph effect in 
Mandelbrot's terminology [32]. For H = 1/2 all correlations at non-zero lags are zero. For 
H G (0, 1/2) the correlations are summable, and in fact they sum up to zero. The latter case is 
less interesting for financial applications (|16|). 

Now, we will make the meaning of ([T]) clear by defining the integral term. The stochastic 
integral in ([I]) is understood as the probabilistic limits of Stieltjes sums. That is, given stochastic 
processes Y and X, such that Y is adapted to the filtration generated by X, we say that 
the integral jYdX exists if, for every t < oo, and for each sequence of partitions {& n }n£N, 
a n = (T™,Tg,...,Tg n ), of the interval [0,t] that satisfies lim n ^ QO ,max i \T l n +1 - T t n \ = 0, the 

sequence of sums (j>2i Yt™{Xt™ +1 — Xtt 1 )) converges in probability. That is, we define 

I Y s dX s =P- lim y^Y T n(X T n - X T n). (4) 

By the Bichteler-Dellacherie Theorem |39] one can see that the integrals of adapted processes 
with respect to fBm may not converge in probability. However when H > I there are two families 
of processes that are integrable with respect to fBm that are sufficiently large for modeling 
purposes. The first family consists of continuous semi-martingales adapted to the filtration of 
fBm as demonstrated in [8]. The second family consists of processes with Holder exponents 
greater than 1 — H. (This integration can be carried out pathwise as demonstrated in [41], and 
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1.2 Markets with Arbitrage Opportunities 

Much of finance theory relies on the assumption that markets adjust prices rapidly to exclude any 
arbitrage opportunities. It is well known that models based on fBm allow arbitrage opportunities 
( |14j and [10]). Even in the case of stochastic a we have shown that there exist arbitrage 
opportunities in a single stock setting [8j. However, strategies that capitalize on the smoothness 
(relative to standard Bm) and correlation structure of fBm to make gains with no risk, involve 
exploiting the fine-scale properties of the process' trajectories. Therefore, this kind of model 
describes a market where arbitrage opportunities can be realized (by frequent trading), which 
seems plausible in real markets. But the ability of a trader to implement this type of strategy is 
likely to be hindered by market frictions, such as transaction costs and the minimal amount of 
time between two consecutive transactions. Indeed Cheridito [I4j showed that by introducing a 
minimal amount of time h > between any two consecutive transactions, arbitrage opportunities 
can be excluded from a geometric fractional Brownian motion model (i.e when a is taken to be 
constant in ([I])). 

Elliot and Van der Hoek [19], and Oksendal and Hu [35J considered another fractional Black- 
Scholes (B-S) model by defining the integrals in ([1]) as Wick type integrals. This fractional B-S 
model does not lead to arbitrage opportunities; however one can argue that it is not a suitable 
model for stock price dynamics. The Wick type integral of a process Y with respect to a process 
X is defined as 

/ Y T nO(X T n -X T n) (5) 

Jo 

where the convergence is in the L 2 space of random variables. (The Wick product is defined 
using the tensor product structure of L 2 ; see [25].) The Wick type integral of Y with respect to 
fBm with Hurst parameter H is equal to the Stieltjes integral defined above plus a drift term 
(see p2] Thm. 3.12), 

t rt rt 

Y s dB^ = / Y,odB?+ / DfY s ds, 
Jo Jo 

where 0(s,i) = H(2H - l)\s - t\ 2H ~ 2 , and DfY t := {D^Y t ){s) is the Hida derivative of the 
random variable Yf. Hence writing an integral equation in terms of Wick product integrals is 
equivalent to writing a Stieltjes differential equation with a different drift term. The fractional B- 
S model with the integrals defined as in ([5|) does not lead to arbitrage opportunities. However, 
this conclusion is based on the redefinition of the class of self-financing strategies. The self- 
financing strategies in a Stieltjes framework are no longer self-financing strategies in a Wick 
framework, so that all the self-financing arbitrage strategies of the Stieltjes framework are ruled 
out by the approach of [19] and [35] . However in the Wick framework it is hard to give economic 
interpretations to trading strategies. For illustration let us consider a simple hold strategy. Let 
u denote the number of shares that are held at time T\ by an economic agent, and let us see 
the value change of the portfolio over the time interval [Tx,T2] if the agent chooses to hold its 
shares of the risky asset in a Wick type framework. If Pt denotes the price of the risky asset at 
time t, the increment of the value of the portfolio over the interval [TijT^) is 

uo(P T2 -P Tl ). (6) 

It is hard to attach a clear economic meaning to this quantity since the Wick product is not a 
path-wise product but rather is defined using the tensor product structure of the L 2 space of 
random variables. On the other hand, involves the actual realization of the increment, 

u(u)(P T2 (u;)-P Tl (u;)), (7) 



L 
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which has a clear economic interpretation. (Here, u denotes the point in the sample space 
corresponding to the given realization of the price process.) So, among the two candidates for 
the value of the increment of a simple hold strategy, ([7]) has a more direct economic meaning. 
Hence the no arbitrage conclusion of [19] and [35] cannot be interpreted within the usual meaning 
of this term, and thus we prefer to apply the definition ([4]) for the stochastic integrals involved. 
(Also see |12] and [44] which also argue that Wick type integrals are not suitable for defining 
trading strategies.) 

Models with fBm differentials in the stochastic differential equations describing the stock price 
can be built however, by considering the Nash-equilibrium which arises from a game in which 
the players are instituional investors manipulating the coefficients of a stochastic differential 
equation with fBm differentials in order to maximize their utilities. The Nash-equilibrium for 
such stochastic differential games is considered by Bayraktar and Poor in [9\. The fBm differen- 
tials in the controlled stochastic differential game can be interpreted as the trading noise arising 
from the activities of small investors who exhibit inertia (see |10j). 

1.3 Non-stationarities Expected from a Financial Time Series 
1.3.1 Time- Variation of H 

In this paper, we are interested in the estimation of the Hurst parameter (H) from historical 
stock index data. In addition, we will study the variability of this parameter over time. Common 
experience with financial data suggests that it exhibits too much complexity to be described by 
a model as simple as ([I]), which says that the log price process 



is a stochastic integral with respect to fBm with drift. In particular, if we could remove the 
non-stationarities due to the drift and stochastic volatility, then Yt would be a process with sta- 
tionary increments. However, stationarity is not usually a property of return series of a financial 
index, which are often extremely turbulent. Therefore, we would like to identify segments of 
time over which the return series is close to stationary. In other words, one of our aims is to 
study the variation of H over time as a gauge of the epochs when the returns process behaves 
like a stationary process. We do not assume any particular form of temporal behavior for this 
parameter; its variation is to be found from our data analysis. We partition the data into smaller 
segments, find the corresponding parameters for each of the segments, and use filtering to re- 
move the extrinsic variation in the parameters due to finiteness of the segment. We vary the 
segmentation size and repeat the procedure described. Then, comparing the fluctuations of H 
among different segmentation levels, we are able to come up with the segments of stationarity. 
The comparison among the different levels of segmentation is possible since extra noise intro- 
duced by altering the segmentation level is filtered out. In Section 5, we show how we come to 
the conclusion that the segments of stationarity for the S&P 500 are on the order of 2 14 points, 
or approximately 8 weeks. 





(8) 
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1.3.2 Drift and Stochastic Volatility 



We can also allow the average growth rate fi in (pQ) to have time variation. Our analysis is 
insensitive to polynomial trends of certain order. Our method, based on the analysis of the 
log-scale spectrum, is also insensitive to additive periodic components. The effects of periodicity 
on the scale spectrum, and a technique for alleviating the polluting effects of additive periodicity 
by increasing the number of vanishing moments of a mother wavelet is analyzed by Abry et al. 
PQ on fBm with H = 0.8 and an additive sinusoidal trend. 

Here we are interested in S&P 500 data, for which the returns have a multiplicative periodic 
component and stochastic volatility in addition to their intrinsic random variation. The existence 
of seasonalities is observed in various financial time series: see [7] for a single stock return series, 
[22] for S&P 500 index data, and [I] and [5] for FEX data. Heavy tailed marginals for stock 
price returns have been observed in many empirical studies since the early 1960's (e.g., |21j,|29j). 
So we expect to have stochastic volatilitjQ in (pQ) as well. Therefore we must take into account 
these non-stationarities in the data while developing an estimation procedure. One way of 
dealing with seasonalities is given in [3] and [23] . Here we show that the scale spectrum method 
is quite insensitive to multiplicative nonstationarities as well as to volatility persistence. 

The remainder of this paper is organized as follows. In Section 2, we introduce our estima- 
tion technique and discuss its statistical and robustness properties. In Section 3 we apply our 
technique to S&P 500 index data and discuss our observations. 



2 The Log-Scale Spectrum Methodology 

In the Appendix we provide a brief introduction to wavelets (following the treatment by Mallat 
[28] ). and the pyramidal algorithm and its initialization. Henceforth we will assume the notation 
introduced in Section [A. 1[ 

For (j, k) £ 1? let dj (k) denote the detail coefficient at scale j and shift k of a random process 

Z: 

/+oo 
ip(2-H - k)Z(t) dt, (9) 
-00 

where ip is any function satisfying the vanishing moments condition (Appendix A. 1.1, (|27p ) for 
some p £ Z+. 

The empirical variance as a function of the scale parameter j S Z is called the scale spectrum 
and is given by 

N/2i 

S i = N/2jt2 1^)] 2 ' 3 < hg 2 (N), (10) 

where N is the number of initial approximation coefficients, i.e. for j = 0. 

1 Observe, however, that equation |T} should not be read in the same way as financial models driven by standard 
Brownian motion, since the stochastic integral J adB H is not a martingale. Moreover, a 2 is not the quadratic 
variation process of this integral, and should not be viewed as the volatility process literally. 
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2.1 Scale Spectra and fBm 



The detail coefficients of fBm satisfy the following, 

IE{[dj(k)] 2 } = K(H)2^ 2H+1 ^ , 

where K(H) is given by 

K(H) 



-2H 



(11) 



(12) 



(2H" + l)(2JT + 2)' 

as given in [2], for example. The behavior in (jllH suggests that the empirical variance of the 



sequence (dj(k)) 
of fBm satisfies 



k<N/2i 



can be used to estimate the Hurst parameter H. The empirical variance 
E{Sj} = JE{[dj(k)] 2 } = K(H)aH 2H+x V. 



We immediately see that the slope of the log scale spectrum log(Sj) yields a simple estimator 
of the Hurst parameter, but as we shall see one can do better than this simple estimator. 

2.2 Synthetically Generated fBm and Corresponding Log-scale Spectra 

In this section, we illustrate the behavior of the log-spectrum on synthetic data. We use the 
method of Abry and Sellan [2] to generate a realization of 2 19 points of fBm with H = 0.6. This 
method uses wavelets for the synthesis of fBm and requires the specification of the number of 
scales, which we choose to be 20. This simulation method is extremely fast, which is important 
for our purposes since we need on the order of 1 million data points to carry out our synthetic 
analysis. 




Figure 1: A realization of fBm (H = 0.6) of 2 points created using 20 scales 



We use segments of length 2 



15 



i.e., 



2 minutes), and the estimates of the Hurst parameter 



over each segment for the case of fBm are shown in Fig. [3l The associated log-scale spectra are 
shown in Fig. [2 (The mean of the estimates of H over the segments is 0.5928, and the standard 
deviation(std) is 0.0149.) 
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Figure 2: Log-scale spectra for segments of length 2 for the realization of fBm in Fig. Q] 




Segmonts(i) 



Figure 3: Hurst estimates for the realization of fBm given in Fig. [T] when the segment lengths 
are 2 15 



In the next section, we will analyze the asymptotic properties of the logarithm of the scale 
spectrum when a in ([8]) is taken to be constant, and we will develop an asymptotically efficient 
estimator using these results. Then in the following section, using the path properties of the 
integrals with respect to fBm, we will discuss the robustness of this estimator to stochastic 
volatility and seasonalities. 



2.3 Asymptotic Distribution of the Logarithm of the Scale Spectrum 

We generalize the method developed by Papanicolaou and Solna [36] to our case when the 
process to be analyzed is given by (JSj) with a constant. The treatment of [36] was concerned 
with Kolmogorov turbulence for which H is around |, and therefore the use of a Haar wavelet 
in ([9]) suffices. In the S&P 500 data, H is expected to be greater than or equal to ^. Below 
we show that for any H 6 (0, 1) using any function ifj with two vanishing moments in ([9]) is 



sufficient for obtaining an asymptotically Gaussian wavelet variance series (llOp . 



Theorem 2.1 Assume that Z in is given by (Q|), with = and a constant, and the 
analyzing wavelet in (0|) has compact support and has vanishing moments of order at least 2. 
Then the logarithm of the scale spectrum U0\) , i.e log 2 (5j), is asymptotically normally distributed 
and satisfies the following asymptotic relation (as N — > oo ): 

log 2 S 3 ~ log 2 (a 2 K(H)) + j(2H + 1) + JL\ j = 1, ...,log 2 (iV) (13) 

where Nj = N/2? , K(H) is given by jTS|). e 3 - is jV(0, 1) and 

COf | -^=, -P= I ~ -r^, (14) 

as AT — > oo. 

First we will state a central limit theorem for heteroskedastic random variables: 

Lemma 2.1 (Berry-Essen Theorem(see 14$ )) Suppose yi,y2, ■■■■sUn are independent random 
variables such that 

JE{ yi } = 0, E{y 2 l } = af and E{\yf\} = Pi , 

and define 



s n = Yl a i and rn = Yl Pi ' 

i=l i=l 

Let F n denote the distribution function of the normalized sum Y17=iyi/ Sn - Then 

\F n {x)-${x)\ <6^, 

where denotes the M(0, 1) distribution function. 

Proof of Theorem [177V 

First we will use the Berry-Essen Theorem to show that Sj given by (|10p is asymptotically 
normal (N — > oo) with mean proportional to 2^ 2H+1 \ Let Nj = N /2? , and denote the vector 
of scale coefficients at scale j by d? = [dj(l), ...,dj(Nj)] T . Also denote the covariance matrix of 
d? by C. Note that d? has the same law as C 1//2 n, where rj is a vector of independent A/(0, 1) 
random variables. Let A be the matrix that diagonalizes C, i.e. A T CA = A, where A is the 
matrix of eigenvalues of C. Since £ = Arj has the same distribution as rj, we have 

NjSj = d? T aV i 77^77 ^ £ T A£ = £ Aj£ 4 2 . 

i 

On denoting A = A/JE{Sj}, we have 

S j =E{S j }(l + w £H&-l))- 
^ i i=i ' 
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Define 



N i ■ 

J 1=1 

where yi = Aj(£ 2 — 1). The y[s are independent random variables with the following properties: 

lE{ yi } = 0, JE{y 2 } = 2\ 2 end E{\yf\} < 28A? 5 

which are easily derived from the fact that the moments of a J\f(0, a 2 ) random variable are given 
by the following expression: 



^ j = ^> it' for j even, 



2" 

and the odd moments are zero. By the Berry-Essen Theorem, it is sufficient to show that 

j _ 2_a=l A i 



[e£i\ ? ] 3/2 ' 



^{djir^djin- k)} 



is small for large iV,-. We first analyze the denominator 

En,m(Cnm) 2 = N±(± 

where 
Let us introduce 

J v J fc=0 7 

We will now show that p(k) decays as k~ 2p+2H (so that 1(H) is a constant), where p is the 
number of vanishing moments and H is the Hurst exponent of the fBm. We can write 

JE{dj(n)dj(n - k)} = 2~V 2 iE;j J dx J dt ip j)n {x)ip jtn . k {t)B H (x)B H {t)^. 

By ([2]) and Fubini's Theorem we have 

]E{dj(n)dj(n — k)} = 2 _J_1 cr 2 J dx j dt^j jin (x)ij jin - k (t)(\t\ 2H + \x\ 2H - 2\t - x\ 2H ) , 

and since J tpj^(x) dx = for all (k,j) then we have 

E{dj(n)dj(n - k)} = -2~ j a 2 dx dtTp j<n (x)7p jjn _ k (t)\t - x\ 2H 



(15) 

_ 2 j(2H+l) a 2 I dt I dxl p{t)^( x )\t- X + k\ 2H . 
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Using Taylor's formula we have, 



r — x\ , v-^ r(2i/ + l) 



= 1+ E 



/,. j - • ^ r(2ff-? + i)r( g + i) v 

(16) 

r ^ + 1 > -e(t,x,k)( t ~ xX2 ' 



T(2H - p +i)r(p + l) v ' y V fc 

where 9(t,x,k) < (1 + a/A;) and where a is the support length of the analyzing wavelet. Using 
(|15() and the facts that the mother wavelet ifi has vanishing moments of order p and is compactly 
supported, we conclude that p(k) decays as k~ 2p+2H . (In (|16p . T denotes the gamma function.) 
Therefore, 

JV,-1 



1(H) = lim (A £ (ty - %(A:) 2 - 2) 



3 k=0 

is constant if p > 2. (Note that for any self-similar stationary increment processes with self- 
similarity parameter H, i.e. X(at) — a H X(t), for any a > 0, H must be in (0,1]; see [12].) 
Therefore having wavelets of vanishing moments of order 2 is necessary to cover the range (0, 1] 
for the Hurst parameter. Note that Haar (having p = 1) wavelets would work only for H 6 (0, |] . 

Now let us consider the numerator of (|2.3p . First we show that the eigenvalues of C/JE{Sj} 
are bounded. By the Gershgorin circle Theorem, the eigenvalue corresponding to a row is not 
different from the corresponding diagonal element by more than the sum of the other elements 
in the row, i.e., 

|A~i - 1| < \Cin\/C n , where C n = ®{S]}- (17) 

Since p(k) decays as k 2p ~ 2H , the sum in (|17p approaches a constant in the limit as Nj — > 00 for 
any H if the wavelet has at least two vanishing moments. Therefore 

Aj < K, 

for some constant K > 1 independent of Nj. (Note that Aj > 0.) Thus 

Nj 

^ Af < NjK 3 . 

i=i 

Hence, in the limit, J in (|2.3|) is given by 

{2K 2 /l{H)f/ 2 



J 



and goes to zero. From the Berry-Essen Theorem we conclude that 

*i E&A,fe 2 -i) 

tends to a JV(0, 1) random variable in distribution. Thus, asymptotically, Sj is given by 



Sj±lE{Sj}(l + ej 
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where e,- is Af(0, 1) 



One can also show that the Sj's are asymptotically jointly normal, by showing that ^ ■ cijSj 
is asymptotically normal for any {a,j)i<j<M (where M is the number of scales) using the same 
line of argument as above. From the variance of the above sum one can find an expression for 
the asymptotic normalized covariance of (Sj): 



D 



j. i 



Cov(Sj, Si) 1 
E{Sj}]E{Si} nZoo yHSljNi 



(18) 



where Ni is the number of detail coefficients at scale i, and N is the total number of data points. 
The asymptotic distribution of log 2 (5j) can be derived exactly the same way as in [36] for the 
pure ffim case, with Haar wavelets as the analyzing wavelets; therefore we will not repeat this 
analysis here. The distribution of log 2 (Sj) is given by 



log 2 (^)^log 2 (iE{^}) + e Jj^ : 



where tj is A/"(0, 1) 



□ 



In view of Theorem (|2.1[) we can use the generalized least squares estimate to estimate the 
Hurst parameter. If we denote c := log2(cr 2 K(H)), and h := 2H + 1 then the generalized least 
squares estimate of b = [c, h] T is given by 



b = (X 1 D X)~ X D~ M 
where M = [log 2 (5i), log 2 (Sj)] T , (J = log 2 (iV)), D is given by ^8]) . and X is given by 



(19) 



X 



1 1 
1 2 

1 J. 



We have 



and 



E{c} = log 2 (a 2 K(H)), 
E{h} = 2H + 1, 



, -l 



1 

N 



E{bb T ] = (X T D- 1 Xln{2) 2 ^ 
for large N. In view of these we have the following estimator for the Hurst parameter: 

H=(h-l)/2, (20) 

with variance 



Var(H) ~ 1/{4N) 



(21) 



for large N. 



In the next section we will allow a in ([8]) to be stochastic. In particular we take a to be any 
stochastic process having sufficient regularity. 
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Figure 4: The periodic function g estimated from the S&P 500, whose one period is a good 
representative of the intraday variability of the index. 




2.4 Robustness to Seasonalities and Volatility Persistence 

We first present an empirical verification of robustness to seasonalities which is followed by a 
theoretical verification of robustness both to seasonalities and to volatility persistence. 

We will denote the seasonal component with g(x), where x > is the time from the beginning 
of the segment under discussion. (Since the seasonal component is deterministic, we will denote 
it by g(t) instead of at to avoid confusion.) In examples such as g(x) = (x + b) q , b denotes the 
beginning of the segment. 

When (pQ) is implemented with a periodic g given by Fig. 01 which represents actual intraday 
variability, then the Hurst estimates do not change significantly. (The intraday variability en- 
velope g of Fig. S]was estimated from the S&P 500 index data as in [22] .) The Hurst estimates 
also do not change when the amplitude of this periodic component is changed be a factor of 100. 
Estimates of the Hurst parameter for segments of length 2 15 are shown in Fig.[5l (Compare with 
Fig. I2.2p . The mean of the estimates over the segments is 0.5912, and the standard deviation is 
0.0020. 
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Figure 6: The log-scale spectrum for ([T]) with g(x) = log(x + 6), g{x) = {x + b) log(x + b) and 
g(x) = (x + b) 2 . 



The scale spectra of ( ([T])) corresponding to various g's, namely g{x) = log(x + b), g(x) = 
(x + b)log(x + b), g(x) = (x + b) 2 , are plotted together on the same graph for comparison 
(Fig. [6]). One immediately notices that the slope of the scale spectrum is invariant to the choice 
of g for these examples; only the amplitude changes with g. For the realization of fBm shown in 
Fig. [TJ the mean and the std. for this parameter (over the segments) are: 0.5930, 0.0088; 0.5942, 
0.0642; and 0.5953, 0.0062 for g(x) = log(x + b), g{x) = (x + b) log(x + b), and g(x) = (x + b) 2 
respectively. 

Path properties play an important role in the robustness of the estimator developed in the 
previous section in the case of stochastic volatility. First note that paths of B H are almost surely 
Holder continuous of order A for all A < H, due to the Kolmogorov-Centsov Theorem ([26). 

The following result due to Ruzmaikina [41J and Zahle [48j gives the path properties of the 
stochastic integrals of a certain class of processes with respect to fBm. 

Lemma 2.2 Suppose a is a stochastic process with almost surely Holder continuous paths of 
order 7 > 1 — H on the interval [0, T] . Then the integral 



Jo 

exists almost surely as a limit of Riemann-Stieltjes sums. Furthermore, the process I is almost 
surely f3-Hdlder continuous on [0,T] for any (3 < H . 

Note that a does not have to be adapted with respect to the natural filtration of B H . For 
H > 1/2 an example of a satisfying the conditions of Lemma 12.21 is the Wiener process. Any 
continuous periodic function also satisfies the assumptions of this theorem. (This is a rather 
straightforward example, but we cite it due to its relevance to seasonality issue.) 

The following lemma is the key result for robustness; it gives a bound on the wavelet detail 
coefficients ([9]) for functions with certain regularity. 

Lemma 2.3 (See ]28$) A function f is Holder continuous of order 7 if and only if the scale 
coefficients corresponding to f satisfy 




(22) 



dj(k)\ < A2^< +1 / 2 \ V(j,fc)GZ 2 
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for some A < oo. 

The Holder continuity exponent of a function is related to its finer scales; therefore one must 
use the scale coefficients defined in ([9]) for j < 0. Using Lemma [2T3l we have the following result. 

Lemma 2.4 Suppose f is a function that is Holder continuous of order A < H and Sj is its 
scale spectrum. Then 

Sj < C 7 2^ 27+1 \ Vj G Z, V 7 < H, (23) 
for some C 7 6 (0, oo) and moreover i23\) does not hold for 7 > H for infinitely many j G Z_ 

and 

hminf W& = 2H+1. 

Let us summarize the results of this section in the following theorem. 

Theorem 2.2 Suppose a is a stochastic process with almost surely Holder continuous paths of 
order 7 > 1 — H on the interval [0, T]. Then there exists a random variable C 7 (u;) G (0, 00) such 
that the scale spectrum of the integral [22]) satisfies 

Sj{uj) < C 7 (cj)2^ +1 \ Vj G Z, V7 < H, (24) 

almost surely. Moreover {2J$ almost surely does not hold for 7 > H for infinitely many j G Z_ 
and 

liminf log(5 i M) =2ff + 1; 

almost surely. 

In Section 12.31 t ne domain of the wavelet is taken to be on the order of the mesh size of the 
discrete samples of the data. However the sample path properties show themselves in the finer 
detail coefficients (j < in ([9])). Then, using the fact that for any function / that is A-H61der 
continuous, g{t) = f(at) is also A-H61der continuous, it can be seen that a scale spectrum for 
the finer scales can be obtained from a scale spectrum corresponding to coarser scales. Letting 
J = log 2 (iV) + 1, where N is as in ([10]), we define ip'(t) = 2~ J / 2 ij){at/2 J ). Then -0' is a wavelet 
which has the same number of vanishing moments as tp. Taking a = 1, and defining the scale 
coefficients as 

/+00 
ij;'(2- j t - k)Z(t)dt, j G Z_ 
-00 

and scale spectrum as 

we have Sj = Sj+j. So as the number of samples of the data increases (J — > 00), we can consider 
finer and finer detail coefficients and the corresponding scale spectrum. 

Since the log scale spectra corresponding to the synthetically constructed data for various 
kinds of seasonalities (see Fig. [6|) and corresponding to the real data were linear, from Theorem 
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I2.2l we can conclude that a linear regression is an accurate way of estimating the Holder continuity 
exponent H of a sample path. This technique has been employed by Arneodo [6] for estimating 
the multifractal spectrum of a given sample path. (Also see |28|). Since the estimator given by 
(|19p and (|20p is a linear weighted least squares fit to the scale spectrum (giving more weight to 
the smaller scales, the weighting factor being proportional to the number of scale coefficients at 
the given scale scale) it is equal to the multifractal spectrum estimator of Arneodo. 

3 Hurst Parameter Estimation for the S&P 500 

After segmenting the samples of \og(Pt/ Pq) from our S&P 500 data set into dyadic segments, 
we estimated the Hurst parameter for each of the segments using the estimator given by (|20p . 
(Note that ./V is the number of points in a given segment.) 

For each segment (|2Q|) requires computing the scale spectrum (jlOp which further requires the 
computation of the detail coefficients §§§ for every scale. It may seem that this is computationally 
expensive for high frequency data; however due to the pyramidal algorithm described in Section 
IA. 1.21 this is not an issue. The pyramidal algorithm calculates the wavelet coefficients for any 
number of scales using octave filter banks given the initial approximation coefficients. Therefore 
the detail coefficients need only be computed at the initial scale. The detail coefficients at higher 
scales are computed from these initial coefficients via the pyramidal algorithm, which uses only 
the approximate coefficients of the preceding scale for calculating the detail coefficients of the 
next scale. 

In our model we want to introduce the flexibility of having a variation in H. If we further 
partition the segments into smaller segments of equal length and estimate the Hurst parameter 
for these smaller segments, we expect to see noise in our estimates due to the noise introduced 
by making the segmentation length smaller. To be able to make a comparison between the 
Hurst estimates corresponding to different segmentation levels we must filter out the extra noise 
introduced. The following section introduces a method to remove this finite segmentation noise. 

3.1 Filtering the Finite Segmentation Noise 

We know that the log-scale spectrum method yields an asymptotically efficient estimator, and 
thus having smaller segment lengths will introduce noise into the estimates. To deal with the 
noise due to finite segmentation length, we will follow the approach of Papanicolaou and Solna 
[36], which we now review. 

On letting h % denote the slope of the log-scale spectrum for the i th segment, we model it as 

U = h i + c, 

where h l is the true slope for the ith segment (h l = 2H l + 1), and is a random variable that 
models the finiteness of the segments. The error term will be assumed to have zero mean, and 
for different segments the error terms will be assumed to be uncorrelated, i.e. 

//•••KV } o. / / j- 

Here it is assumed that (H l ) is a stationary stochastic process independent of the fBm. On 
assuming that the slope process is exponentially correlated, lh denotes its correlation length, 
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and a\ denotes its variance, we have 

C h (i,j) = Effi -Eih'Mh? -E{h j })} = a 2 h exp(-L\i-j\/l h ) 
where L is the segment length. 

Here we will give a minimum variance unbiased linear estimator for the slope process (h l ). 
Let K denote the vector whose components are the estimates (h % ), and let K denote the vector 
whose components are the corresponding realizations of the slope process (h l ). We want to find 
a filter such that 

E{\\ TK-K || 2 } 
is minimized under the constraint that the mean is preserved, i.e., 

TE{K] = E{K}. 

It can be shown that T is given by 

T = (C h + C c )- 1 [C h + u T ®E{K}], (25) 
where the vector u = (ui) is given by 

Eiti} - E{K T }(C h + Cc)- 1 ^ 

7/ ■ — _ _ 

E{KT}(C h + C c )-iE{K} 

and is the diagonal covariance matrix of the estimation errors £\ Here Chi denotes the ith 
column of Ch- To be able to implement V one must estimate cr? and lh of (|3.ip . and the variance 
a\j of the noise process. For this purpose we will examine the empirical variogram of the slope 
process (h % ). The variogram at lag j is given by 

1 J ' j 
V(j)= ,/ ^ Y(h k+ i-h k ) 2 

where J is the number of segments. Since 

JE{V(j)} = - eM-L\j\/k) + *l (26) 

fitting (by a weighted least squares fit) the left-hand side of (|26|) to the empirical variogram 
yields the estimates for ah, cr^ and lh- Here it should be noted that the initial values must be 
chosen carefully. The most important parameter seems to be the mean correlation length Z^, 
and we choose it to be longer than the segments used in estimating the slopes in order to have 
approximate stationarity relative to segmentation. Also note that it is necessary to perform a 
weighted fit to the empirical variogram, because there are finitely many segments, and therefore 
the empirical variogram is closer to its expected value for smaller lags. 

We will now illustrate the power of filtering in removing the effects of noise due to finite 
segmentation length on one realization of the synthetically created fBm of Fig. [TJ We partitioned 
the data into segments of length 2 13 , estimated the Hurst parameter for each of the segments, 
and then applied filtering. (TP where T is given in (|25p .) The Hurst estimates we obtained 
with and without filtering are given in Fig. [71 The standard variation without filtering is 0.0283, 
and the standard variation after filtering is 0.0097; so clearly we mitigate to the finite segment 
length effects by filtering. 
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Figure 7: Hurst estimates for the realization given in Fig. [T] for 64 segments of length 2 13 with 
and without filtering. 
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3.2 Results on the S&P 500 Index 

We now turn our attention to the analysis of the S&P 500 index. As noted above, we consider 
data taken at one-minute intervals over the course of 11.5 years from January 1989 to May 2000. 
(We take the closing price of each minute.) The data consists of 1,128,360 observations, which 
is on the order of 2 20 . 

When the data is segmented into 275 segments of length 2 12 (approximately two weeks) and 
the above methodology is applied, we obtain the Hurst parameter estimates shown in Fig. [SJ 
(The mean is 0.6156, and the standard deviation is 0.0531, which supports the idea of local 
variation, i.e. the Hurst parameter varies significantly from segment to segment.) 

Alternatively, when the data is segmented into 137 segments of length 2 13 (approximately four 
weeks), we obtain the Hurst parameter estimates shown in Fig. [9j (The mean is 0.6027, and the 
std. is 0.0504.) 
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Figure 9: Hurst parameter estimates for the S&P 500 data with segment lengths of 2 
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Figure 10: Hurst parameter estimates for the S&P 500 data with segment lengths of 2 
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Similarly, when the data is segmented into 68 segments of length 2 14 (approximately four 
weeks), we obtain the Hurst parameter estimates shown in Fig. [TUJ (The mean is 0.6011 and 
the std. is 0.0487.) And, finally, when the data is segmented into 34 segments of length 2 15 
(approximately eight weeks), we obtain the Hurst parameter estimates given in Fig. [TTJ (The 
mean is 0.6008 and the std. is 0.1821.) 

If we plot these results on the same axes as shown in Fig. [12] we will arrive at the significant 
observation that the length of a stationary segment is 2 , which corresponds to approximately 
2 months. That is, when the segments are of length 2 15 , the nonstationarity is dominant. 

3.3 Increase in the Market Efficiency 



From Fig. [TUl one sees that, although the Hurst parameter of this data set is significantly above 
the efficient markets value of H = i, it began to approach that level in 1997 (segment 50). We 
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Figure 11: Hurst parameter estimates for the S&P 500 data with segment lengths of 2 
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Figure 12: Hurst parameter estimates for the S&P 500 data with various segment lengths. The 
strongly varying graph corresponds to estimates for segment lengths of 2 15 points. The graphs 
corresponding to the segment lengths of 2 12 , 2 13 , 2 14 can be distinguished from the line intensity, 
where the intensity decreases as the segment length increases, moreover the graphs are labeled 
by log2 of the corresponding segments length. 

conjecture that this behavior of the market might be related to the increase in Internet trading, 
which had the three-fold effect of increasing the number of small traders, the frequency of trading 
activity and the availability of market data. This observation is modeled in |10j . with a simple 
microstructure model for the price evolution of a financial asset where the price is driven by the 
demand of many small investors whose trading behavior exhibits "inertia" . This means that the 
agents trade the asset infrequently and are inactive most of the time. It is shown that when the 
price process is driven by market imbalance, the logarithm of the price process is approximated 
by a process of the form ([8]). Moreover it is shown that as the frequency of trading increases, 
the price process can be approximated by geometric Brownian motion, which is consistent with 
the above comments. 
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4 Conclusion 



In this paper we have developed a method to investigate long range dependence, which is quan- 
tified by the Hurst parameter, in high frequency financial time series. Our method exhibits 
robustness to the non-stationarities that are present in the data, e.g. seasonal volatility, fat- 
tailed distributions of the increments, and possible variations in the Hurst parameter. (In fact, 
the Hurst parameter reflects the relative frequency of the trading activity of the market par- 
ticipants, and hence variations in the Hurst parameter are expected |10j.) The segments of 
stationarity for the Hurst parameter are byproducts of this analysis. They are found to be 
approximately two months in duration for S&P 500 index data sampled at one minute intervals. 
Strikingly, the Hurst parameter was around the 0.6 level for most of the 1990s, but dropped 
closer to the efficient markets level of 0.5 in the period 1997-2000, coinciding with the growth 
in Internet trading among small investors. 

A Appendix 

A.l Wavelet Theory 

For a more detailed treatment see |17] . |28j or |34j . 
A. 1.1 Multi-resolution Analysis 

A wavelet tfj is a function mapping 1R to IR such that the dilated and translated family 

^•, fc (t) = 2-i/ 2 ^(2-H - k) for (j, k) G Z 2 , 

is an orthonormal basis of L 2 (H). A wavelet is defined via a scaling function through multi- 
resolution analysis (MRA). A sequence of closed subspaces {Vj,j G Z} of L 2 (IR) is an MRA if 
the following six properties are satisfied: 

n vj = w> 

jez 

Closure(|J Vj) = L 2 (M), 
Vj+l C Vj, 

V(j, k) G Z 2 , f(t) G Vj & f(t - Vk) G Vj, 
Vj G Z, f(t) G Vj O /(l) G V j+1 , 

and, there exists a function <j>(t) in Vq, called the scaling function, such that the collection 

{<p(t -k),ke Z} 

is a Riesz basis for Vq. 

It follows that the scaled and shifted functions of the scaling function 

{(j)j,k(t) = 2- j/2 cp{2- j t -k),k£Z} 
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is an orthonormal basis of Vj for all j. 

Orthonormal wavelets carry the details necessary to increase the resolution of a signal ap- 
proximation. The approximations of a function / 6 L 2 (H) at scales 2 J and 2 J_1 are respectively 
equal to its orthogonal projections onto Vj and Vj—\. Let Wj be the orthogonal complement 
of Vj in Vj-i, i.e. Vj-\ = Vj ffi Wj. Then the orthogonal projection of / onto Vj-i can be 
decomposed as the sum of orthogonal projections onto Vj and Wj. The projection onto Wj 
provides the details that appear at scale 2 3 but which disappear at the coarser scale 2 J . One 
can construct an orthonormal basis of Wj by scaling and translating a wavelet ip S V$ , and show 
that the family given by (|A,1.1|) is an orthonormal basis for L 2 (H). Since rp and <p are i n 
and by the properties of MRA we have: 

M 

<j)(t) = v / 2^/^(A:)0(2t - k), 

k=0 

and 

M 
k=0 

where h is a low-pass filter satisfying some admissibility conditions (see [28]), and g is the 
conjugate mirror filter of h: 

g{n) = {-l) 1 - n h{l-n). 
Therefore wavelets are specified via the scaling filter h. 

Wavelets are capable of removing nonstationarities because they have vanishing moments. We 
say that ip has p vanishing moments if it is orthogonal to any polynomial of degree less than p: 

/+oo 
t k ip(t)dt = for < k < p. (27) 
-oo 

The most versatile wavelet family is the family of Daubechies compactly supported wavelets, 
which are enumerated by their number of vanishing moments. The Daubechies compactly sup- 
ported wavelet with p = 1 is the Haar wavelet, which is the only wavelet in this family for which 
an explicit expression can be found. In our analysis we used Daubechies compactly supported 
wavelets with p = 2. 

A. 1.2 Pyramidal Algorithm (Mallat Algorithm) and its Initialization 

Let us denote the projection of a function / G L 2 (M) onto Vj and Wj respectively by 
cij(k) =< f,(j)j t k > arid dj(k) =< f,ipj,k >> k G Z, 

where < •, • > denotes the standard 1? inner product. The pyramidal algorithm calculates these 
coefficients efficiently with a cascade of discrete convolutions and subsamplings. Denote time 
reversal by x(n) = x(—n) and upsampling by 

. . I x(n) if n is even, 
x[n) = < 

I if n is odd. 

The pyramidal algorithm is then given by the following theorem: 
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Theorem A.l (Mallat [28]) Decomposition: 



2p)aj(n) = ctj * h(2p) 

2p)chj(n) = chj *g(2p). 
Reconstruction: 

+00 +00 
a j(p) = Hp - 2n ) a j+i( n ) + Yl 2n K+i( n ) 

n=— oo n=— oo 

= a j+1 * h(p) + d j+ i*g(p). 

To compute the detail coefficient at scale j, we use only the approximation coefficient at the 
previous scale aj-%. Note that the domain of h and g are compact if we use Daubechies wavelets 
with compact support. 

The pyramidal algorithm assumes initially that it is given the wavelet coefficients at a fine 
scale, and proceeds to compute the detail coefficients at higher scales. The initial sequence ao(k) 
requires the evaluation of a continuous time integral, 

oo(*0 = / f(t)<t>(t-k)dt, (28) 

JR 

where 4> is the scaling function. 

Typically what is done is to set ao(k) = f(k), an ad-hoc procedure that will almost certainly 
introduce errors. (An exception is the case in which coiflets [28] are used, since in that case the 
scaling function has vanishing moments of the same order as the wavelet.) Here, we will replace 
the continuous time integral by a sum: 

a o{k) = ^2f(n)(j)(n - k). 

n 

Note that this sum is equal to the integral of ([28]) when f(t) is a low order polynomial. (See [31] •) 
An explicit expression for <p is not known, however (j> at the integer points can be calculated 
from the defining recursion for the scale function: 

M 

<j)(t) = V2^2h(k)(f>(2t-k). 

There are better methods one could apply for the initialization, as suggested by Beylkin et al. 
in [T[]- 



n=— 00 
+00 



Remark A.l We use only the vanishing moments property (Wl) and compactness of the support 
of the wavelets to prove Theorems \2.1\ and \2.2\ However we need the orthogonality introduced 
by the multiresolution analysis for implementing the pyramidal algorithm introduced in Section 
\A.1.2\ Moreover in our analysis of the S&P 500 index data we work with compactly supported 
wavelets to further increase the efficiency of the pyramidal algorithm. 
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